PHACTS, a computational approach to classifying the lifestyle of phages
Abstract
MOTIVATION: Bacteriophages have two distinct lifestyles: virulent and temperate. The virulent lifestyle has many implications for phage therapy, genomics and microbiology. The lifestyle of a newly sequenced phage is currently determined using standard culturing techniques. Such laboratory work is not only costly and time consuming, but also cannot be applied to phage genomes assembled from environmental sequencing. Therefore, a computational method that uses the sequence data of phage genomes is needed.

RESULTS: Phage Classification Tool Set (PHACTS) uses a novel similarity algorithm and a supervised Random Forest classifier to predict whether the lifestyle of a phage, described by its proteome, is virulent or temperate. The similarity algorithm creates a training set from phages with known lifestyles; this set, together with the lifestyle annotations, is used to train a Random Forest to classify the lifestyle of a phage. PHACTS predictions are shown to have a 99% precision rate.

AVAILABILITY AND IMPLEMENTATION: PHACTS was implemented in the PERL programming language and uses the FASTA program (Pearson and Lipman, 1988) and the R programming language library 'Random Forest' (Liaw and Weiner, 2010). The PHACTS software is open source and is available as a downloadable stand-alone version or can be accessed online through a user-friendly web interface. The source code, help files and online version are available at http://www.phantome.org/PHACTS/.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
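The abstract does not spell out the similarity algorithm, so the following is only a minimal sketch of one way a proteome could be turned into a numeric feature vector for a supervised classifier. It is not the PHACTS implementation. The function names, the toy sequences, and the k-mer Jaccard score (used here as a stand-in for a FASTA similarity score) are all assumptions made for illustration.

```python
from typing import List, Set


def kmers(seq: str, k: int = 3) -> Set[str]:
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def toy_similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of k-mer sets.

    Hypothetical stand-in for a FASTA alignment score; not what PHACTS uses.
    """
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def feature_vector(proteome: List[str], reference_proteins: List[str]) -> List[float]:
    """Build one feature per reference protein.

    Each feature is the best similarity between that reference protein
    (taken from a phage of known lifestyle) and any protein in the proteome.
    """
    return [
        max((toy_similarity(p, ref) for p in proteome), default=0.0)
        for ref in reference_proteins
    ]


# Toy data (invented sequences, for illustration only).
reference_proteins = ["MKTAYIAK", "GGGGGGGG"]
query_proteome = ["MKTAYIAK", "PLVVQRST"]
features = feature_vector(query_proteome, reference_proteins)
```

In a pipeline of this shape, vectors like `features` for phages of known lifestyle, paired with their virulent or temperate labels, would form the training set handed to a Random Forest implementation.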
Volume: 28
Publication date: 2012